Controlling an Agent by Focusing its Attention on Interactively Selected Patterns

Authors

  • Sébastien JODOGNE
  • Justus H. PIATER
Abstract

Designing robotic controllers can quickly become a challenging problem. Such controllers must cope with a huge number of possible inputs that can be noisy (e.g. visual inputs furnished by cameras), select actions from a continuous set, and adapt automatically to evolving or stochastic environmental conditions. In real-world robotic applications, it is difficult to model the environment formally and to specify directly in a programming language how to solve a given task. Living beings, on the other hand, face such problems every day and are able to learn how to solve them efficiently. Some research has therefore tried to mimic natural strategies for solving robotic tasks. Neuropsychological evidence shows that the natural learning process is interactive: after each interaction with the environment, the agent receives a reward signal, called the reinforcement, which measures its performance. The goal of the agent is to maximize its reinforcements over time. As the agent observes the consequences of its reactions on the reinforcements it obtains, it gains more and more expertise on its task, which ultimately enables it to act optimally. It is important to note that in such a setup the agent is never directly told the optimal action to choose: there is no explicit external teacher. As an example, consider the task of a robotic agent escaping from a discrete maze: the reward could be zero until the agent manages to find the exit of the maze, at which point it obtains a positive reinforcement. Note that in this task, the benefit of taking an action may become apparent only long after the interaction. This is called the delayed reward problem. These ideas gave rise to the algorithmic theory of Reinforcement Learning (RL) [1], the goal of which is to solve closed-loop adaptive control problems without a model by analyzing the reinforcements earned during a sequence of interactions.
The major advantages of the RL protocol are that it is fully automatic and that it imposes weak requirements on the environment. RL can be applied in any environment that obeys discrete-time Markovian probabilistic dynamics and on which it is possible to define a reinforcement function r(s,a) that gives, for each state s of the environment and each action a, a numerical measure of the worth of taking that action in that state.
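
The maze example and the reinforcement function r(s,a) described above can be made concrete with a small sketch. The abstract does not say which learning algorithm is used, so the code below substitutes tabular Q-learning, a standard RL method, purely for illustration. The five-state corridor, the function names, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions introduced here, not details taken from the paper.

```python
import random

# Hypothetical 1-D "maze": a corridor of states 0..4 with the exit at state 4.
N_STATES = 5
EXIT = N_STATES - 1
ACTIONS = (-1, +1)  # move left, move right


def next_state_of(state, action):
    """Deterministic move, clipped at the corridor walls."""
    return min(max(state + action, 0), EXIT)


def reinforcement(state, action):
    """r(s, a): zero everywhere, positive only when the move reaches the exit."""
    return 1.0 if next_state_of(state, action) == EXIT else 0.0


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning (an assumed stand-in, not the paper's method)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != EXIT:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Exploit, breaking ties at random so early episodes still move.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            reward = reinforcement(state, action)
            nxt = next_state_of(state, action)
            # The exit is terminal, so no future value is added there.
            future = 0.0 if nxt == EXIT else max(q[(nxt, a)] for a in ACTIONS)
            # The delayed reward propagates backwards through this update.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
    return q


def greedy_policy(q):
    """Best-valued action in each non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(EXIT)}


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

In this sketch the agent is never told which action is correct. It only observes the reinforcement, and the value of moving towards the exit spreads backwards from the final step to the earlier states, which is how the delayed reward problem is handled in this toy setting.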


Similar resources

Personalized nutrition and its roles on some metabolic disorders: A narrative review

Introduction: Considering an individual’s characteristics such as genetics along with other characteristics and dietary habits can help to provide an effective diet for prevention and controlling metabolic disorders. Accordingly, in the present study, we aimed to review evidence on personalized nutrition (PN) and its roles in metabolic disorders. Materials and Methods: In the present narrative ...


Advertisements Evaluation Focusing on Human Face

Given the importance of advertising and the massive expenditures on it, evaluating its effectiveness is one of the key questions in marketing. Also endorser as the most commonly used communication tool, allocates a large portion of advertising costs and its use is still under development. In this project, we use Communicational and Observational approaches to explore the impact of utilizing ‘hu...


The Effect of Attention on Quiet Eye Behavior and Accuracy of Execution on a Targeting Task

The purpose of this study was to determine the effect of focusing attention on quiet eye behavior and accuracy of execution on dart throwing skills. For this purpose, 20 male students in dart beginner (age range 19-22 years old) were voluntarily selected. All participants performed external and internal attention instructions in a counterbalanced manner. Thus, Participants first made 10 attempt...


Synthesis of MgO Nanoparticles and Their Antibacterial Properties on Three Food Poisoning Causing Bacteria

Background: Application of nanoparticles in the removal of pathogenic bacteria is very important. The use of these materials can be appropriate for controlling pathogens and food-borne diseases. The purpose of this study was to synthesize magnesium oxide nanoparticles and investigate its antibacterial effect on several bacteria causing food poisoning. Materials and Methods: Oxide magnesium nan...


Assessment and Comparing of Hospital Performance Using “Accreditation Pattern”, “Organizational Excellence Pattern” and Program Chain Patterns

Introduction: Hospital performance measurement is an essential part for providing feedback on the efficacy and effectiveness of services. The purpose of this study was assessment and comparing of hospital performance using “Accreditation Pattern”, “Organizational Excellence Pattern” and Program Chain (IPOCC) Patterns. Methods: This descriptive-comparative study was conducted in 2019 in the ed...



Publication date: 2011